 translate speech


The Download: what's next for Neuralink, and Meta's language translation AI

MIT Technology Review

In November, a young man named Noland Arbaugh announced he'd be livestreaming from his home for three days straight. His broadcast was in some ways typical fare: a backyard tour, video games, meet mom. The difference is that Arbaugh, who is paralyzed, has thin electrode-studded wires installed in his brain, which he used to move a computer mouse on a screen, click menus, and play chess. The implant, called N1, was installed last year by neurosurgeons working with Neuralink, Elon Musk's brain-interface company. Arbaugh's livestream is an indicator that Neuralink is a whole lot closer to creating a plug-and-play experience that can restore people's daily ability to roam the web and play games, giving them what the company has called "digital freedom." But this is not yet a commercial product.


Meta's new AI model can translate speech from more than 100 languages

MIT Technology Review

"Meta has done a great job having a breadth of different things they support, like text-to-speech, speech-to-text, even automatic speech recognition," says Chetan Jaiswal, a professor of computer science at Quinnipiac University, who was not involved in the research. "The mere number of languages they are supporting is a tremendous achievement." Human translators are still a vital part of the translation process, the researchers say in the paper, because they can grapple with diverse cultural contexts and make sure the same meaning is conveyed from one language into another. This step is important, says Lynne Bowker, Canada Research Chair in Translation, Technologies and Society at Université Laval in Quebec, who didn't work on Seamless. "Languages are a reflection of cultures, and cultures have their own ways of knowing things," she says.


Microsoft teases AI interpreter that can translate speech in real time

PCWorld

Imagine being in a meeting, perhaps with a client or with other team members, and not quite sharing the same preferred languages. Communication barriers can cause all kinds of misunderstandings and issues when collaborating, and a translator isn't always available. At Ignite 2024, Microsoft just unveiled a cool new feature for Microsoft Teams that uses artificial intelligence to not only translate words from one language to another in real time, but even emulate the speaker's voice, intonations, and mannerisms. Microsoft says a preview version of this "AI interpreter" feature will come some time in early 2025, but it remains unknown when the feature will be released more generally across all versions of Teams. The Teams interpreter will initially support nine languages: English, Chinese (Mandarin), Korean, German, Italian, Japanese, French, Spanish, and Brazilian Portuguese.


Using AI to Translate Speech For a Primarily Oral Language

#artificialintelligence

AI-powered speech translation has mainly focused on written languages, yet nearly 3,500 living languages are primarily spoken and don't have a widely used writing system. This makes it impossible to build machine translation tools using standard techniques, which require large amounts of written text in order to train an AI model. To address this challenge, we've built the first AI-powered speech-to-speech translation system for Hokkien, a primarily oral language that's widely spoken within the Chinese diaspora but lacks a standard written form. We're open-sourcing our Hokkien translation models, evaluation datasets and research papers so that others can reproduce and build on our work. The translation system is part of our Universal Speech Translator project, which is developing new AI methods that we hope will eventually allow real-time speech-to-speech translation across many languages.


Speechmatics raises $62M for its inclusive approach to speech-to-text AI – TechCrunch

#artificialintelligence

Last week I wrote about an AI startup that's building technology that can alter, in real time, the accent of someone's speech. But what if the AI goal instead is to make it possible for people, speaking in whatever way they do, to be understood just as they are, and to remove some of the bias inherent in a lot of AI systems in the process? There's a major need for that, too, and now a UK startup called Speechmatics -- which has built AI to translate speech to text, regardless of the accent or how the person speaks -- is announcing $62 million in funding to expand its business. Susquehanna Growth Equity out of the U.S. led the round, with UK investors AlbionVC and IQ Capital also participating. This Series B is a big step up for Speechmatics.


Waibel Elected a Fellow of the International Speech Communication Association

CMU School of Computer Science

Alex Waibel, a professor in Carnegie Mellon University's Language Technologies Institute, has been elected a fellow of the International Speech Communication Association (ISCA). The ISCA recognized Waibel for his pioneering contributions in multilingual and multimodal spoken language processing and translation. Waibel, also faculty at the Karlsruhe Institute of Technology in Germany, has worked on speech and machine translation for decades, developing systems that now can translate speech in real time. Waibel demonstrated the first speech translation systems in the 1990s and 2000s. By 2020, he had developed a system that outperformed humans in recognizing conversational speech on a public benchmark.


Machine Learning Behind Google Translate Services - AI Summary

#artificialintelligence

Google Translate initially launched with Phrase-Based Machine Translation as its key algorithm. The main improvement in the translation systems came with the introduction of Google Neural Machine Translation, or GNMT. With Translatotron, Google demonstrated that a single sequence-to-sequence model can directly translate speech from one language into speech in another language, without the need for an intermediate text representation, unlike cascaded systems. Translatotron is claimed to be the first end-to-end model that could directly translate speech from one language into speech in another language, and it was also able to retain the source speaker's voice in the translated speech.
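The difference between a cascaded system and a direct model can be sketched as a difference in function composition. The following is a toy illustration only, not Google's implementation: the `Audio` type, the tiny `LEXICON`, and the functions `transcribe`, `translate_text`, `synthesize`, `cascaded_translate`, and `direct_translate` are all hypothetical stand-ins invented for this sketch. In this toy, the cascaded path passes through plain text, which carries no speaker information, while the direct path maps audio to audio in a single step.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Audio:
    """Toy stand-in for a speech clip (hypothetical, not a real audio type)."""
    lang: str
    speaker: str
    words: Tuple[str, ...]


# Hypothetical toy lexicon standing in for a trained translation model.
LEXICON = {("es", "en"): {"hola": "hello", "mundo": "world"}}


def transcribe(audio: Audio) -> str:
    """Stage 1 of the cascade: speech to source-language text."""
    return " ".join(audio.words)


def translate_text(text: str, src: str, tgt: str) -> str:
    """Stage 2 of the cascade: source-language text to target-language text."""
    table = LEXICON[(src, tgt)]
    return " ".join(table.get(word, word) for word in text.split())


def synthesize(text: str, lang: str, speaker: str = "default") -> Audio:
    """Stage 3 of the cascade: target-language text to speech."""
    return Audio(lang=lang, speaker=speaker, words=tuple(text.split()))


def cascaded_translate(audio: Audio, tgt: str) -> Audio:
    """Speech -> text -> text -> speech, with text as the intermediate form."""
    text = transcribe(audio)
    translated = translate_text(text, audio.lang, tgt)
    # The text stage in this toy carries no voice information,
    # so the output falls back to a default synthetic speaker.
    return synthesize(translated, tgt)


def direct_translate(audio: Audio, tgt: str) -> Audio:
    """Speech -> speech in one step, with no intermediate text in this toy."""
    table = LEXICON[(audio.lang, tgt)]
    words = tuple(table.get(word, word) for word in audio.words)
    return Audio(lang=tgt, speaker=audio.speaker, words=words)


clip = Audio(lang="es", speaker="ana", words=("hola", "mundo"))
cascaded = cascaded_translate(clip, "en")
direct = direct_translate(clip, "en")
```

Both paths produce the same translated words here; they differ only in whether the speaker attribute survives, which mirrors the voice-retention claim in the summary above in a highly simplified form.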


AI localization tool claims to translate your words in your voice

Engadget

Localization is a tricky issue for all content creators. It can take significant time and resources to make their work fully accessible to folks who speak different languages. One company thinks it has cracked part of that code with an artificial intelligence system that automatically translates speech into other languages in the same speaker's voice. Resemble AI says its Localize tool can keep voices consistent in various languages in movies, games, audiobooks, corporate videos and other formats. Google is working on similar tech, but we haven't heard much about that since it published a paper on the Translatotron system last year.


Amazing Google AI speaks another language in your voice

#artificialintelligence

On Wednesday, Google unveiled Translatotron, an in-development speech-to-speech translation system. It's not the first system to translate speech from one language to another, but Google designed Translatotron to do something other systems can't: retain the original speaker's voice in the translated audio. In other words, the tech could make it sound like you're speaking a language you don't know -- a remarkable step forward on the path to breaking down the global language barrier. According to Google's AI blog, most speech-to-speech translation systems follow a three-step process. First they transcribe the speech.


Google's latest Translate function turns speech of one dialect directly into another

Daily Mail - Science & tech

Google has announced a new translation tool that converts speech in one language into another and preserves the speaker's original voice. The tech giant's new system works without first converting the speech to text. A first of its kind, the tool is able to do this while retaining the voice of the original speaker and making it sound 'more realistic', the tech giant said. Google claims the system, dubbed 'Translatotron', will be able to retain the voice of the original speaker after translation while also understanding words better. It can directly translate speech from one language into speech in another language, without relying on the intermediate text representation in either language, as is required in cascaded systems.